PET-validated EEG-machine learning algorithm predicts brain amyloid pathology in pre-dementia Alzheimer’s disease

Developing reliable biomarkers is important for screening Alzheimer’s disease (AD) and monitoring its progression. Although EEG is a non-invasive, direct measurement of brain neural activity and has potential for use in various neurologic disorders, its vulnerability to noise and the difficulty of interpreting and quantifying signal information have limited its clinical application. Many studies have applied machine learning (ML) to EEG, but their accuracy in detecting AD is not high, or they were not validated with Aβ PET scans. We developed an EEG-ML algorithm to detect brain Aβ pathology among individuals with subjective cognitive decline (SCD) or mild cognitive impairment (MCI), and validated it with Aβ PET. 19-channel resting-state EEG and Aβ PET were collected from 311 subjects: 196 SCD (36 Aβ +, 160 Aβ −) and 115 MCI (54 Aβ +, 61 Aβ −). EEG data from 235 subjects were used for training the ML models, and data from 76 for validation. EEG features were standardized for age and sex. Multiple sets of important features were selected by 6 statistical analyses. Then, we trained 8 machine learning algorithms on each important feature set. Meanwhile, we conducted a paired t-test to find features that differed statistically between the amyloid-positive and amyloid-negative groups. The best model showed 90.9% sensitivity, 76.7% specificity and 82.9% accuracy in MCI + SCD (33 Aβ +, 43 Aβ −). Limited to SCD, it showed 92.3% sensitivity, 75.0% specificity and 81.1% accuracy (13 Aβ +, 24 Aβ −), and for MCI, 90% sensitivity, 78.9% specificity and 84.6% accuracy (20 Aβ +, 19 Aβ −). Similar trends in EEG power were observed in the group comparisons between Aβ + and Aβ −, and between MCI and SCD: enhancement of frontal/frontotemporal theta and attenuation of mid-beta in centroparietal areas. The present findings suggest that accurate classification of beta-amyloid accumulation in the brain based on QEEG alone could be possible, which implies that QEEG is a promising biomarker for beta-amyloid.
Since QEEG is more accessible, cost-effective, and safer than amyloid PET, QEEG-based biomarkers may play an important role in the diagnosis and treatment of AD. We expect that specific QEEG patterns could play an important role in predicting the future progression of cognitive impairment in the preclinical stage of AD. Further feature engineering and validation with a larger dataset are recommended.

We randomly selected 25% of the subjects (N = 76; 13 SCD with Aβ, 24 SCD without Aβ, 20 MCI with Aβ, and 19 MCI without Aβ) to exclude from model training and use for subsequent verification. Since the ratio of Aβ + to Aβ − data was markedly skewed, at approximately 1:2.5, the positive data were doubled to balance the training data set. As an augmentation method, each Aβ + recording was divided into its first half and its second half, and the two halves were treated as two separate data sets. For example, if the EEG recording lasts 100 s, we make one data set from 0 to 50 s and the other from 50 to 100 s. This is a widely used method for data augmentation in machine learning or deep learning with biological signals 39.
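The split-half augmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, each recording is simplified to a single list of samples, and labels are assumed to be 1 for Aβ + and 0 for Aβ −.

```python
from typing import List, Sequence, Tuple


def split_in_half(samples: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split one recording into its first and second half by sample count."""
    mid = len(samples) // 2
    return list(samples[:mid]), list(samples[mid:])


def augment_positives(recordings, labels):
    """Return (recording, label) pairs; each positive recording becomes two halves."""
    augmented = []
    for recording, label in zip(recordings, labels):
        if label == 1:  # assumed coding: 1 = amyloid positive
            first, second = split_in_half(recording)
            augmented.append((first, 1))
            augmented.append((second, 1))
        else:
            augmented.append((list(recording), 0))
    return augmented


# Toy example: a 100 s recording at 1 sample per second (sample value = time in s).
positive = list(range(100))
negative = list(range(100))
augmented = augment_positives([positive, negative], [1, 0])
```

In this toy case the positive recording yields one segment covering 0 to 50 s and another covering 50 to 100 s, while the negative recording is kept whole.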
The SCD inclusion criteria were as follows: (1) persistent subjective complaints of cognitive decline, (2) ≥ 60 years of age, (3) at least 6 years of primary school, (4) a memory test standard score of 0 to − 1.5 standard deviations (SD), with the other cognitive tests > − 1.5 SD, and (5) informed consent of the participant. Individuals showing a standard score < − 1.5 SD on any other cognitive test were excluded because of the possibility of MCI. This is a commonly used standard in Korean AD society 40. We defined MCI in accordance with Petersen's criteria, which presume MCI in individuals with objective memory impairment for their age (standard score < − 1.5 SD) but normal performance in activities of daily living (ADL) 41.
Amyloid PET scans were performed to detect Aβ plaques in the brain. The standardized uptake value ratio (SUVR), normalized to the cerebellar gray matter, was used to quantify cortical Aβ. 18F-florbetaben PET images were acquired and processed following the previously described procedure. Individual 3D T1-weighted magnetic resonance (MR) images were preprocessed and co-registered to the corresponding PET images. The MR images, normalized to a standardized stereotaxic space, were divided into three probabilistic tissue maps composed of gray matter, white matter, and CSF. A volume-based template of 90 regions of interest was aligned to the individual MR image. The SUVR was calculated using whole voxels of the 18F-florbetaben PET images referenced to the cerebellum. The global SUVR was estimated by averaging the 90 regional uptake values. All PET images were interpreted by nuclear medicine physicians who were blinded to the neuropsychological tests and classifications and who dichotomized the images as amyloid-positive or amyloid-negative using visual reads. The PET images were interpreted only by readers who had successfully completed the electronic training program provided by the manufacturer.
EEG recording, feature generation and selection. All subjects were instructed to relax with their eyes closed and to refrain from movement and talking. EEG data were recorded (bandpass: 0.1-45.5 Hz; Natus Nicolet EEG v32, Nihon Kohden JE921A and Grass AS40) in the resting-state, eyes-closed condition from 19 scalp electrodes positioned over the whole head according to the International 10-20 System (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, O2). A linked-ear reference electrode was noted if present, but was not deemed mandatory for the present study, in order to respect the standard internal protocols of the several clinical recording units. In each case, a ground electrode was located between the AFz and Fz electrodes. Electrode impedance was kept below 10 kOhm.
All recorded artifact-free EEG data were re-referenced offline to a common average montage to harmonize the EEG data collected using different reference electrodes. All data were digitized in continuous recording mode (approximately 3 min of EEG; sampling rate: 200 or 250 Hz, to avoid aliasing).
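Re-referencing to a common average amounts to subtracting, at every time sample, the mean across all channels. The sketch below illustrates the idea on a plain list-of-lists layout (channel by sample); the function name and data layout are assumptions for illustration and do not reflect the platform actually used in the study.

```python
from typing import List


def common_average_reference(eeg: List[List[float]]) -> List[List[float]]:
    """eeg[c][t] is channel c at sample t; subtract the across-channel mean per sample."""
    n_channels = len(eeg)
    n_samples = len(eeg[0])
    rereferenced = [[0.0] * n_samples for _ in range(n_channels)]
    for t in range(n_samples):
        mean_t = sum(eeg[c][t] for c in range(n_channels)) / n_channels
        for c in range(n_channels):
            rereferenced[c][t] = eeg[c][t] - mean_t
    return rereferenced


# Toy example with 3 channels and 2 samples (the study used 19 channels).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
car = common_average_reference(toy)
```

After re-referencing, the channel values at each sample sum to zero, which removes the dependence on whichever physical reference electrode was used at acquisition.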
EEG preprocessing was performed to denoise each recording and minimize the effects of artifacts. The first stage of EEG preprocessing involved sampling the signals at 250 Hz and filtering them with a bandpass filter in the 1-45.5 Hz range. After passing through a notch filter, the EEG was prepared for downstream processing, which included re-referencing (using common average reference, CAR), bad epoch rejection (using artifact subspace reconstruction, ASR), and removal of stationary noise by adaptive mixture independent component analysis (AMICA). Finally, artifacts identified via electromyogram (EMG) and electrooculogram (EOG) were removed to generate QEEG data. All EEG preprocessing, sensor-level data calculation, and source-level data calculation and extraction were performed using a cloud-based EEG analysis platform (iSyncBrain®, iMediSync, Inc. Korea; https://isyncbrain.com).
To successfully predict morbidity, among the many features associated with Aβ + EEG data, we selected those in which the change was most noticeable in channel, frequency, or functional connectivity. In many previous machine learning studies, features were selected using various statistical analysis methods to identify those showing significant differences between the target group and the control group. First, we conducted a paired t-test to find features that differed statistically between the amyloid-positive and amyloid-negative groups. To examine the correlation between the two groups, a normality test was performed for each group. The normality tests used were the Shapiro-Wilk test and the Kolmogorov-Smirnov test; if either test indicated normality, we considered normality to be satisfied. Pearson correlation was applied if both groups followed normality. If both groups did not follow normality, Spearman's rank correlation test was applied. Additionally, features were identified using 6 methods (t-test, ElasticNet, Mann-Whitney, Random Forest importance, GBM, XGBoost). MATLAB and Statistics Toolbox R2018b (The MathWorks Inc, Natick, MA, USA) and iSyncBrain® (iMediSync, Inc. Korea; https://isyncbrain.com) were used for statistical analyses. In order to find purely amyloid-specific features and exclude those associated with cognitive impairment, we applied the same feature selection procedure to find the features that most significantly distinguished SCD from MCI. We then withheld those features from the model-building process so that the final model would consist of purely amyloid-specific features, independent of the cognitive impairment-related EEG features among MCI or SCD.
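Two decision rules in this paragraph lend themselves to a short sketch: the choice between Pearson and Spearman correlation, and the removal of impairment-related features. The code below is illustrative only. It takes normality-test p-values as inputs rather than computing them, the significance level of 0.05 is an assumption, the feature names are hypothetical, and the mixed case (one group normal, the other not), which the text does not spell out, is assumed here to fall back to Spearman.

```python
from typing import Iterable, List


def is_normal(p_shapiro: float, p_ks: float, alpha: float = 0.05) -> bool:
    """Accept normality if EITHER test fails to reject it (alpha is an assumed level)."""
    return p_shapiro > alpha or p_ks > alpha


def choose_correlation(group_a_normal: bool, group_b_normal: bool) -> str:
    """Pearson only when both groups are normal; otherwise Spearman (assumed fallback)."""
    return "pearson" if group_a_normal and group_b_normal else "spearman"


def amyloid_specific(amyloid_feats: Iterable[str], impairment_feats: Iterable[str]) -> List[str]:
    """Drop features that also separate SCD from MCI, preserving the original order."""
    impairment = set(impairment_feats)
    return [f for f in amyloid_feats if f not in impairment]


# Hypothetical p-values: group A passes Shapiro-Wilk, group B passes neither test.
method = choose_correlation(is_normal(0.20, 0.01), is_normal(0.03, 0.01))

# Hypothetical feature names, for illustration only.
kept = amyloid_specific(
    ["Fz_theta_rel", "C3_beta2_rel", "Pz_alpha_rel"],
    ["Pz_alpha_rel", "O1_delta_rel"],
)
```

Here `method` resolves to Spearman because only one group is treated as normal, and `kept` retains the two features that are not shared with the SCD-versus-MCI selection.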
Machine learning modeling and validation. The significant feature sets obtained by the feature selection procedure were then entered into eight representative machine learning algorithms (SVM, Logistic, KNN, Naive Bayes, Random Forest, AdaBoost, GBM and XGBoost), yielding 48 models (6 sets × 8 algorithms). The accuracy was then calculated by five-fold cross-validation. Feature sets that exceeded a predetermined accuracy threshold in this verification were used as important feature sets in building the final model. MATLAB and Machine Learning Toolbox R2018b (The MathWorks Inc, Natick, MA, USA) were used for machine learning. Seventy-six EEGs were used as an independent test data set to validate classification performance. They consisted of 13 SCD Aβ +, 24 SCD Aβ −, 20 MCI Aβ + and 19 MCI Aβ − that had been set aside for validation. The final performance of each model was determined by the results of the test validation. The models with the highest accuracy were selected as the final model candidates. Figure 1 schematically represents the procedure of building classification models from feature generation to feature selection and model validation.
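The search over feature sets and algorithms with five-fold cross-validation can be sketched in a few lines. The example below is a toy stand-in, not the study's pipeline (which used MATLAB): it uses two hypothetical feature sets and two deliberately simple classifiers (nearest centroid and majority class) in place of the 6 feature sets and 8 algorithms, and the data are synthetic.

```python
import random
from itertools import product
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy over k folds; fit(X, y) must return a predict(x) function."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


def fit_nearest_centroid(X, y):
    """Toy classifier: assign the label whose training centroid is closest."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return lambda x: min(
        centroids, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab]))
    )


def fit_majority(X, y):
    """Baseline classifier: always predict the most frequent training label."""
    majority = max(set(y), key=y.count)
    return lambda x: majority


# Synthetic data: column 0 separates the classes, column 1 is unrelated to the label.
y = [i % 2 for i in range(40)]
X = [[2 * y[i] + ((i * 37) % 11 - 5) / 20, ((i * 53) % 17) / 17] for i in range(40)]

feature_sets = {"informative": [0], "noise": [1]}  # hypothetical feature sets
algorithms = {"nearest_centroid": fit_nearest_centroid, "majority": fit_majority}

results = {}
for (fs_name, cols), (alg_name, fit) in product(feature_sets.items(), algorithms.items()):
    X_sub = [[row[j] for j in cols] for row in X]
    results[(fs_name, alg_name)] = cross_val_accuracy(fit, X_sub, y)

best = max(results, key=results.get)
```

The same loop scales directly to a 6 × 8 grid: every (feature set, algorithm) pair gets its own cross-validated accuracy, and the pairs above a chosen threshold are carried forward to the held-out test.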

Results
Group comparison and modeling process. Figure 2 shows the group differences in EEG between the amyloid-positive and amyloid-negative groups, calculated by t-test. We found distinct EEG power patterns between Aβ + and Aβ − in both SCD and MCI. The presence of Aβ plaques was associated with stronger relative theta power [4-8 Hz] in the middle frontal area and weaker beta2 power [15-20 Hz] in both centroparietal areas. When Aβ + MCI was compared to Aβ + SCD, these patterns became more obvious. Figure 3 shows the accuracy of models trained with multiple feature sets and multiple ML algorithms. Different ML algorithms produced widely varying classification performance even when given the same feature set. Figure 4 shows the classification accuracy of the final model on the third-party data for the SCD + MCI group, the SCD-only group and the MCI-only group.
Model validation with 3rd party dataset. We validated the trained ML model with a total of 76 independent datasets that were not included in the training of the ML models, consisting of 13 Aβ + SCD, 24 Aβ − SCD, 20 Aβ + MCI and 19 Aβ − MCI. The best classification model showed 90.9% sensitivity, 76.7% specificity and 82.9% accuracy when it was applied to the total SCD + MCI group. When it was applied to the SCD-only group, sensitivity was 92.3%, specificity was 75.0% and accuracy was 81.1%. For the MCI-only group, it showed 90% sensitivity, 78.9% specificity and 84.6% accuracy. Table 2 presents the confusion matrix. Overfitting occurred repeatedly during deep learning training; the best validation accuracy was 63.9%, obtained with ResNet (Fig. 5). On the third-party test data, it showed 60.5% accuracy, 63.6% sensitivity and 58.1% specificity.
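As a consistency check, the reported percentages can be reproduced from confusion-matrix counts. The counts below are not taken from Table 2; they are back-calculated here from the reported group sizes and percentages (for example, 92.3% sensitivity among 13 Aβ + SCD subjects corresponds to 12 true positives), so they should be read as an inference rather than as the published table.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) in percent, rounded to one decimal."""
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)
    return tuple(round(v, 1) for v in (sensitivity, specificity, accuracy))


# Counts inferred from the reported percentages and group sizes (an assumption).
counts = {
    "SCD": dict(tp=12, fn=1, tn=18, fp=6),   # 13 Aβ +, 24 Aβ −
    "MCI": dict(tp=18, fn=2, tn=15, fp=4),   # 20 Aβ +, 19 Aβ −
}
counts["SCD + MCI"] = {
    key: counts["SCD"][key] + counts["MCI"][key] for key in ("tp", "fn", "tn", "fp")
}

report = {group: diagnostic_metrics(**c) for group, c in counts.items()}
```

With these counts, `report` gives (92.3, 75.0, 81.1) for SCD, (90.0, 78.9, 84.6) for MCI and (90.9, 76.7, 82.9) for the combined group, matching the figures stated in the text.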

Discussion
We collected resting-state, eyes-closed EEG from SCD or MCI populations, all of whom also underwent amyloid PET analysis for the differential diagnosis of AD. The presence of AD among the SCD or MCI population can significantly affect the further progression of cognitive impairment. Early-stage AD, especially at the prodromal stage, is very difficult to detect, and there is no cost-effective biomarker to identify preclinical AD. Although a few previous studies have shown promising results in differentiating normal, MCI or Alzheimer's dementia by applying various ML algorithms to EEG features, there is no PET-validated study of an EEG-based ML algorithm that predicts the presence of brain Aβ plaque among SCD or MCI with at least 80% accuracy, except our previous study 43. In our previous study, we presented distinct spatial (e.g. posterior cortex) and frequency (e.g. delta to alpha) features of the resting-state QEEG rhythms between the Aβ + and Aβ − groups in both MCI and SCD, and then developed separate ML classification algorithms for Aβ in MCI and in SCD, since we could not obtain a sufficient number of datasets.
There have been several earlier attempts to confirm the presence of beta-amyloid using resting-state EEG alone. However, none of them achieved classification accuracy above 80%. One reason we could improve the performance is the process of searching for the best combination of feature set and algorithm. There are multiple methods to find a good feature set, and multiple algorithms to build predictive models. We found that the combination of feature set and algorithm is an important component of performance. As can be seen from Fig. 3, even with the same feature set, performance fluctuates depending on the algorithm adopted. For example, the feature set obtained by Random Forest showed poor performance with SVM, but good performance with XGBoost. In addition, we think traditional machine learning algorithms are powerful enough, but there remains room for feature engineering. By trying every combination of feature sets and algorithms, we believe we could find a better model that would otherwise remain hidden. Previous studies could not adjust for the effects of age and sex on the selected EEG features, or for the effects of cognitive impairment-related EEG changes, which could be major confounding factors for Aβ-specific EEG features, since a relative increase in slow waves and a decrease in fast waves are easily observed in normal ageing, neurodegeneration or cognitive impairment. Biases in key features that arise from differences in sex and from age-associated cognitive decline may negatively impact the classification performance of AI-based algorithms. Furthermore, most patients who visit clinical institutions with cognitive decline are elderly. In order to overcome such limitations, all selected features were standardized into z-scores using the only age- and sex-differentiated EEG normative database, on iSyncBrain® 44.
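The z-score standardization described above can be illustrated with a small sketch: each feature value is compared with the mean and standard deviation of the matching age and sex stratum of a normative database. The normative values, the decade-wide age bands and the function names below are all hypothetical; the actual iSyncBrain normative database is not described in enough detail here to reproduce.

```python
from typing import Dict, Tuple

# Hypothetical normative table: (age band, sex) -> (mean, SD) of one EEG feature.
NORMS: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("60-69", "F"): (0.25, 0.0625),
    ("60-69", "M"): (0.25, 0.125),
    ("70-79", "F"): (0.3125, 0.0625),
    ("70-79", "M"): (0.3125, 0.125),
}


def age_band(age: int) -> str:
    """Map an age in years to a decade band such as '60-69' (assumed banding)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def z_score(value: float, age: int, sex: str) -> float:
    """Standardize a feature value against the norm for the subject's age band and sex."""
    norm_mean, norm_sd = NORMS[(age_band(age), sex)]
    return (value - norm_mean) / norm_sd


# Same raw value, different strata: the standardized score differs.
z_female_64 = z_score(0.375, 64, "F")
z_male_72 = z_score(0.375, 72, "M")
```

The same raw feature value maps to a z-score of 2.0 in one stratum and 0.5 in the other, which is how standardization keeps normal age- or sex-related differences from being mistaken for pathology.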
This standardization process could contribute to the improvement of modeling accuracy by removing age- or sex-related EEG changes. [...] cognitive impairment could also be an important factor that contributes to the enhancement of the algorithm's accuracy, making it more robust and specific to Aβ plaque. In previous studies, various ML algorithms were applied to the same selected features to find the best model. However, there were no studies applying multiple [...]. These similar validation results between SCD and MCI mean that our ML algorithm is specific for the detection of Aβ pathology regardless of cognitive impairment. In addition, previous studies involved data collected from a single institution or 2-3 institutions, whilst our study is cohort-based. Hence, we firmly believe that our model is more general and robust in comparison. Meanwhile, we could not obtain a satisfactory result from deep learning. We think the biggest reason for the low performance was the lack of data. Another reason was limited computational power. Since we were not equipped [...] AlexNet. In addition, in the process of converting EEG, which is high-dimensional data, into an image for deep learning, much information such as connectivity and power ratio could not be expressed; only sensor power features could.
We conducted several preliminary studies to detect the specific EEG patterns related to AD and observed EEG patterns similar to those in this study, in which relative theta increased in the frontal or temporal area and mid-beta [12-20 Hz] decreased in both centroparietal areas. It was demonstrated that the Aβ + SCD group showed stronger theta activity in the frontotemporal area compared to the Aβ − SCD group, and this pattern was more prominent in the APOE ε4 allele (+) subgroup 31. A similar pattern was enhanced when AD dementia was compared to the non-dementia AD group 32. Based on these observations, we successfully developed an ML-based classification algorithm to discriminate amnestic MCI from SCD or normal ageing 33. Various prior studies have also repeatedly reported that the enhancement of theta in frontal or frontotemporal areas and the attenuation of mid-beta in centroparietal areas in resting-state EEG are related to MCI or ADD 45-50. Furthermore, these patterns not only reflect the current disease state, but are also related to the future progression of cognitive impairment in the preclinical stage of AD. In a prospective study to find predictive factors for the progression of cognitive impairment in amyloid PET-confirmed preclinical AD, the group that progressed from SCD to MCI showed more significant enhancement of theta in midfrontal or temporal areas compared to the stable group. A relative weakening of beta waves in both centroparietal areas was also observed in the progressive group of SCD or MCI 51.
From the observations in this and other previous studies, we assumed that the enhancement of frontal theta beyond the age-related physiological increase could be a surrogate EEG biomarker for neurodegenerative pathology such as Aβ plaque or related cognitive impairment. On the other hand, net neurocognitive performance is not determined only by neurodegenerative pathology, but is also affected by cognitive reserve. Weakened neural compensation is related to attenuated cognitive function, especially in AD 52. We hypothesize that bilateral enhancement of beta waves in the centroparietal area could be related to neural compensatory activity in ageing or neurodegenerative disease, since this pattern is obvious in preclinical AD, but becomes weak and disappears in AD MCI or ADD. This beta wave range [12-20 Hz] is related to alertness or active information processing, and bilateral enhancement of this frequency band is not observed in healthy young adults. Further studies are needed to validate the meaning of this phenomenon and its relation to neural compensation or cognitive reserve, which could be tentative targets of cognitive enhancement intervention or neuromodulation. Aβ plaque is the main pathology of AD and its diagnostic biomarker. However, the quantity of Aβ plaque is only weakly correlated with AD-related cognitive impairment, so a decrease in Aβ plaque on amyloid PET cannot reliably predict the prevention of further cognitive impairment or the recovery of cognitive function. Although EEG patterns or ML algorithms for AD are not a direct biological measurement of Aβ plaque, they are a specific endophenotype of underlying Aβ plaque, and could be a surrogate marker for AD-related cognitive functional changes. There have been many efforts to find easily accessible, cost-effective, and noninvasive biomarkers for AD.
Compared to various tentative peripheral biomarkers, EEG is the only direct, noninvasive measure of the brain's real-time neuroelectric activity, which is closely synchronized with emotional, cognitive and behavioral status and with functional brain activity. Advances in computational neuroscience, AI/ML technology and cloud computing have uncovered a large amount of previously invisible information in EEG data and improved its usability in clinical situations.
To summarize the novelty and advantages of our study: although there have been various EEG-based prediction models for Alzheimer's disease, our study is the first to use data from more than 300 people, with amyloid PET results as ground truth, in the training of AI models. The EEG data utilized in our study were obtained through the collaboration of several clinical institutions and universities, allowing the model to be more general and stable. Moreover, through the application of gender- and age-standardized z-scores, our prediction model is robust to gender- and age-specific differences. Furthermore, the optimal feature set and algorithm were established by combining multiple statistical techniques and varying algorithms, rather than by using a single statistical technique or learning algorithm. We expect our classification model to make significant contributions to the categorization of dementia or cognitive impairment by predicting amyloid PET results from EEG data. In particular, our model aids in segmenting various dementia pathologies and in understanding the trends in neurophysiological signal-based longitudinal studies.
Our research suggests that EEG can be useful for early screening of AD and its progression before complex and expensive AD diagnostic procedures are undertaken, which also contributes to reducing the cost burden on society. Our model particularly aids early screening of patients who visit primary and secondary medical institutions that lack in-depth examination methods. Given the recent partial approval of aducanumab as a therapeutic drug that targets amyloid plaques, it is crucial to build foundations for future use and improvements. Our study aims to contribute towards this through the presented amyloid prediction model.

Limitation
The accuracy of AI/ML models can be enhanced by additional training datasets, more specifically selected and processed features, and the application of advanced algorithms. In this study, we tried to recruit enough subjects for the training dataset. However, feature selection and ML modeling were limited to a data-driven approach. Additional training datasets, more robust feature engineering and more advanced ML algorithms are necessary to upgrade our algorithm. The stability of classification should then be continuously tested with bigger datasets recorded with different EEG equipment.
Our model is based solely on physiological characteristics and disregards conventional clinical tests such as MMSE, APOE4 and MRI. The current model is based on a single modality, and its robustness may be improved by developing multi-modality models. For instance, heart rate variability (HRV) characteristics that describe physiological biases in the autonomic nervous system could be considered, as well as other clinical metadata.
Moreover, recent studies report that not all cases of clinically diagnosed Alzheimer's disease show positive amyloid PET results 53,54. Since our study utilized data from clinically diagnosed Alzheimer's disease that also exhibited positive amyloid PET results, the model may misclassify Alzheimer's disease data with negative amyloid PET results and vice versa.
The relationship between midfrontal theta power or bilateral centroparietal beta power and disease progression or cognitive impairment should be further investigated. There is individual variability among the AD population in EEG patterns, progressiveness, and cognitive function, so we have to verify the existence of different EEG endophenotypes and their relationship with progression or drug responsiveness in AD. Explainability of AI/ML is especially important in medicine for understanding pathophysiological mechanisms or exploring reversible factors that could be tentative intervention targets. With additional EEG datasets for AD, we will develop an explainable deep learning algorithm that should be more accurate and provide more specific information about regions of interest, spectral power, functional connectivity and temporal dynamics.

Conclusion
It is expected that AI/ML-empowered digitalized EEG could work as a cost-effective, easily accessible, and repeatedly measurable CNS digital biomarker for AD screening and monitoring, especially when it is combined with easy-to-use wearable EEG devices. It could also serve as a companion diagnostic in AD drug development, contributing to saving time and cost in screening feasible participants and finding the best responders to new interventions based on EEG endophenotypes.
In this study, deep learning techniques were also tried, but the results were not good; we attribute this to the lack of data and computational power and to the loss of information in the process of converting EEG into images. As a result, classic machine learning techniques were adopted. We believe that the significance of this study is that even existing algorithms can improve performance through feature engineering. In future studies, we will gather more data and try various techniques for data augmentation, while continuing feature engineering and fine-tuning of the machine learning models. We will also upgrade our computing hardware and apply the latest deep learning methods to a bigger dataset.
Moreover, we would like to confirm the existence of different EEG endophenotypes and their relationship with progression or drug responsiveness in AD. Accordingly, we will design a longitudinal study to follow the subjects who participated in this study.
Meanwhile, we will analyze the repeatedly misclassified data to find whether there is any common pattern, which could lead to the finding of a new endophenotype of Alzheimer's disease. We expect our classification model to make significant contributions to the categorization of dementia or cognitive impairment by predicting amyloid PET results from EEG data. In particular, our model aids in segmenting various dementia pathologies and in understanding the trends in neurophysiological signal-based longitudinal studies 55,56.

Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to data protection regulations, but are available from the corresponding author on reasonable request.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.